Efficient Implementation of the AI-REML Iteration for Variance Component QTL Analysis
Authors
Abstract
Regions in the genome that affect complex traits, known as quantitative trait loci (QTL), can be identified through statistical analysis of genetic and phenotypic data. When restricted maximum-likelihood (REML) models are used, the mapping procedure is normally computationally demanding. We develop a new, efficient computational scheme for QTL mapping using variance component analysis and the AI-REML algorithm. The algorithm uses an exact or approximate low-rank representation of the identity-by-descent (IBD) matrix; combined with the Woodbury formula for matrix inversion, this allows the computations in the body of the AI-REML iteration to be performed more efficiently. In cases where an exact low-rank representation of the IBD matrix is available a priori, the improved AI-REML algorithm normally runs almost twice as fast as the standard version. When an exact low-rank representation is not available, a truncated spectral decomposition is used to determine a low-rank approximation. We show that in this case, too, the computational efficiency of the AI-REML scheme can often be significantly improved.
Department of Mathematics and Physics, Mälardalen University, Sweden; Division of Scientific Computing, Department of Information Technology, Uppsala University, Sweden; Linnæus Center for Bioinformatics, Uppsala University
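The two ingredients named in the abstract, a low-rank factor of the IBD matrix and the Woodbury formula, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a simplified covariance V = sigma2_e * I + sigma2_q * Z Z^T with only a QTL component and a residual, where Z Z^T equals or approximates the IBD matrix and sigma2_q > 0. The function names `low_rank_factor` and `woodbury_solve` and all parameter names are hypothetical.

```python
import numpy as np


def low_rank_factor(ibd, rank):
    """Truncated spectral decomposition of a symmetric PSD matrix.

    Returns Z (n x rank) such that Z @ Z.T approximates `ibd`,
    keeping the `rank` largest eigenvalues.
    """
    eigvals, eigvecs = np.linalg.eigh(ibd)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:rank]       # indices of the largest ones
    vals = np.clip(eigvals[idx], 0.0, None)      # guard against tiny negative round-off
    return eigvecs[:, idx] * np.sqrt(vals)


def woodbury_solve(z, sigma2_q, sigma2_e, rhs):
    """Solve (sigma2_e * I + sigma2_q * Z Z^T) x = rhs with the Woodbury formula.

    Only a k x k system is factorised (k = number of columns of Z),
    instead of the full n x n covariance matrix. Assumes sigma2_q > 0.
    """
    k = z.shape[1]
    small = np.eye(k) / sigma2_q + (z.T @ z) / sigma2_e   # k x k inner matrix
    correction = z @ np.linalg.solve(small, z.T @ rhs) / sigma2_e
    return (rhs - correction) / sigma2_e
```

With n individuals and rank k much smaller than n, each solve costs on the order of n*k^2 operations rather than n^3, which is the kind of saving the abstract attributes to the low-rank representation. The speed-up figures quoted in the abstract refer to the authors' scheme, not to this sketch.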
Similar resources
Non-iterative variance component estimation in QTL analysis.
In variance component quantitative trait loci (QTL) analysis, a mixed model is used to detect the most likely chromosome position of a QTL. The putative QTL is included as a random effect and a method is needed to estimate the QTL variance. The standard estimation method used is an iterative method based on the restricted maximum likelihood (REML). In this paper, we present a novel non-iterativ...
PX × AI: algorithmics for better convergence in restricted maximum likelihood estimation
INTRODUCTION Maximising the (log) likelihood (logL) in restricted maximum likelihood (REML) estimation of variance components almost invariably represents a constrained optimisation problem. Iterative algorithms available to solve this problem differ substantially in computational resources needed, ease of implementation, sensitivity to choice of starting values and rates of convergence. One of...
Employing a Monte Carlo Algorithm in Newton-Type Methods for Restricted Maximum Likelihood Estimation of Genetic Parameters
Estimation of variance components by Monte Carlo (MC) expectation maximization (EM) restricted maximum likelihood (REML) is computationally efficient for large data sets and complex linear mixed effects models. However, efficiency may be lost due to the need for a large number of iterations of the EM algorithm. To decrease the computing time we explored the use of faster converging Newton-type ...
Efficient Implementation of the New Restricted Maximum Likelihood Algorithms
Recently tridiagonalization and diagonalization have been proposed as methods to speed the EM algorithm for variance component estimation in restricted maximum likelihood. These methods require approximately the same computing resources, but only if the most efficient strategies are employed. When eigenvectors are explicitly calculated in diagonalization, computing requirements more than double...
GENOMIC SELECTION DAIRRy-BLUP: A High-Performance Computing Approach to Genomic Prediction
In genomic prediction, common analysis methods rely on a linear mixed-model framework to estimate SNP marker effects and breeding values of animals or plants. Ridge regression–best linear unbiased prediction (RR-BLUP) is based on the assumptions that SNP marker effects are normally distributed, are uncorrelated, and have equal variances. We propose DAIRRy-BLUP, a parallel, Distributed-memory RR...